Audio-visual sensor fusion with neural architectures

Authors

Barbara Helga Talle

Andreas Wichert

Abstract

In this paper we present a new word recognition system for monosyllabic words. It consists of two types of neural networks, which makes it easy to investigate three different fusion architectures for audio-visual signals. Furthermore, two kinds of preprocessing are compared: besides low-level data, linear discriminant analysis is used to reduce the dimensionality of the audio and visual signals. Our cross-validation experiments show a slight advantage for an intermediate fusion model over an early fusion model that uses jointly preprocessed audio and visual data.
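The abstract names early and intermediate fusion but does not spell out the network details, so the following is only a minimal sketch of the general distinction, not the authors' system. It assumes that early fusion concatenates the audio and visual feature vectors before a single hidden layer, and that intermediate fusion gives each modality its own hidden layer and merges the hidden activations before the output layer. All layer sizes, the random untrained weights, and the function names (`early_fusion`, `intermediate_fusion`, `init_layer`) are hypothetical and chosen for illustration.

```python
import math
import random


def linear(x, weights, biases):
    """Affine map: one output value per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def dense(x, weights, biases):
    """Fully connected layer with tanh activation."""
    return [math.tanh(v) for v in linear(x, weights, biases)]


def softmax(scores):
    """Turn raw scores into class probabilities."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(rng, n_in, n_out):
    """Random, untrained weights (illustration only) and zero biases."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def early_fusion(audio, video, params):
    """Assumed early fusion: concatenate the inputs, then one shared network."""
    joint = audio + video  # list concatenation of the two feature vectors
    hidden = dense(joint, *params["hidden"])
    return softmax(linear(hidden, *params["out"]))


def intermediate_fusion(audio, video, params):
    """Assumed intermediate fusion: per-modality hidden layers, merged later."""
    h_audio = dense(audio, *params["audio_hidden"])
    h_video = dense(video, *params["video_hidden"])
    return softmax(linear(h_audio + h_video, *params["out"]))


rng = random.Random(0)
# Hypothetical dimensions: 6 audio features, 4 visual features, 3 words.
N_AUDIO, N_VIDEO, N_HIDDEN, N_WORDS = 6, 4, 5, 3

early_params = {
    "hidden": init_layer(rng, N_AUDIO + N_VIDEO, N_HIDDEN),
    "out": init_layer(rng, N_HIDDEN, N_WORDS),
}
inter_params = {
    "audio_hidden": init_layer(rng, N_AUDIO, N_HIDDEN),
    "video_hidden": init_layer(rng, N_VIDEO, N_HIDDEN),
    "out": init_layer(rng, 2 * N_HIDDEN, N_WORDS),
}

# Stand-ins for one preprocessed audio frame and one visual frame.
audio = [rng.gauss(0, 1) for _ in range(N_AUDIO)]
video = [rng.gauss(0, 1) for _ in range(N_VIDEO)]

p_early = early_fusion(audio, video, early_params)
p_inter = intermediate_fusion(audio, video, inter_params)
print("early fusion posteriors:       ", [round(p, 3) for p in p_early])
print("intermediate fusion posteriors:", [round(p, 3) for p in p_inter])
```

The two functions differ only in where the modalities meet: at the input in the first case, after a modality-specific hidden layer in the second. Because the weights are random, the printed probabilities carry no meaning beyond showing the data flow.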

Similar resources

Multi-Focus Image Fusion in DCT Domain using Variance and Energy of Laplacian and Correlation Coefficient for Visual Sensor Networks

The purpose of multi-focus image fusion is to gather the essential information and the focused parts from the input multi-focus images into a single image. These multi-focus images are captured with different depths of focus of cameras. Many multi-focus image fusion techniques have been introduced that consider focus measurement in the spatial domain. However, the multi-focus image ...

Fusion of Multi-Sensor Imagery for Night Vision: Color Visualization, Target Learning and Search

Footnote 1: This work was sponsored by the U.S. Defense Advanced Research Projects Agency, under Air Force Contract F19628-95-C-0002. Opinions, interpretations, conclusions, and recommendations are those of the authors and not necessarily endorsed by the U.S. Air Force. Footnote 2: Present address: MCIS Department, Jacksonville State University, Jacksonville, AL 36265, U.S.A. Abstract: We present methods and result...

Resource aware design of a deep convolutional-recurrent neural network for speech recognition through audio-visual sensor fusion

Today’s Automatic Speech Recognition systems only rely on acoustic signals and often don’t perform well under noisy conditions. Performing multi-modal speech recognition processing acoustic speech signals and lip-reading video simultaneously significantly enhances the performance of such systems, especially in noisy environments. This work presents the design of such an audio-visual system for ...

Sensor Fusion Weighting Measures in Audio-Visual Speech Recognition

Audio-Visual Speech Recognition (AVSR) uses vision to enhance speech recognition but also introduces the problem of how to join (or fuse) these two signals together. Mainstream research achieves this using a weighted product of the output of the phoneme classifiers for both modalities. This paper analyses current weighting measures and compares them to several new measures proposed by the autho...

Sensor Fusion for Mobile Robot Navigation - Proceedings of the IEEE

We review techniques for sensor fusion in robot navigation, emphasizing algorithms for self-location. These find use when the sensor suite of a mobile robot comprises several different sensors, some complementary and some redundant. Integrating the sensor readings, the robot seeks to accomplish tasks such as constructing a map of its environment, locating itself in that map, and recognizing obj...

Publication date: 1999
